archive of this tutorial.
u/santiagus-succ asked:
Any way to know the combined word count of the fics you read?
Was curious but also don't want to spend an entire hour checking each specific word count. Maybe there is an option I didn't notice?
u/nevereverevee replied:
This popped up in my email, and I thought "ahh yes, exactly my skill set." This solution is involved but it should get you about the right answer.
{"_id":"ao3_read_wordcount","startUrl":["https://archiveofourown.org/users/YOUR AO3USERNAME/readings?page=[1-NUMBER OF PAGES IN YOUR HISTORY]"],"selectors":[{"id":"fic element","type":"SelectorElement","parentSelectors":["_root"],"selector":"li.reading","multiple":true,"delay":0},{"id":"title","type":"SelectorText","parentSelectors":["fic element"],"selector":"h4 a:nth-of-type(1)","multiple":false,"regex":"","delay":0},{"id":"wordcount","type":"SelectorText","parentSelectors":["fic element"],"selector":"dd.words","multiple":false,"regex":"","delay":0},{"id":"author","type":"SelectorText","parentSelectors":["fic element"],"selector":"a:nth-of-type(2)","multiple":false,"regex":"","delay":0}]}
It won't count the words of any fics that have since been deleted, and if a fic has updated since you last read it, it will count those new words too. It also doesn't know if you haven't read a whole fic (like if you opened it and then didn't read it, or only read a few chapters). But it'll get you pretty close!
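Once the scrape is done, the word counts still have to be added up. Here is a minimal Python sketch of that step, assuming you exported the results as CSV and that the column is named `wordcount` (the selector id in the sitemap above). The function name and the sample rows are made up for illustration. AO3 shows counts with thousands separators, so commas are stripped first, and blank cells (for example, deleted fics) are skipped rather than counted.

```python
import csv
import io


def total_words(csv_text, column="wordcount"):
    """Sum a word-count column, skipping blank or non-numeric cells.

    Returns (total, skipped) so you can see how many rows had no usable count.
    """
    total = 0
    skipped = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Counts look like "12,345", so drop the thousands separators.
        raw = (row.get(column) or "").replace(",", "").strip()
        if raw.isdigit():
            total += int(raw)
        else:
            skipped += 1
    return total, skipped


# Made-up sample data standing in for a real export.
sample = '''title,wordcount,author
Fic A,"12,345",writer1
Fic B,987,writer2
Deleted fic,,
'''

print(total_words(sample))  # (13332, 1)
```

For a real export, read the file with `open("your_export.csv", encoding="utf-8").read()` and pass the text in; the filename here is a placeholder.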
Let me know if you try it!
1 firefox extension here
2 i made a separate version that works better for me by scraping bookmarks instead of history. if you're like me and you bookmark everything you complete, this will be a better solution for you. i've put together my sitemap below:
it grabs the title, category, rating, fandom, (first tagged) relationship, summary, author, word count, completion status, warnings, chapters, date bookmarked, date posted (or last updated), and the work tags.
3 do NOT do this. before you begin scraping, adjust the request interval and page load delay to 30000 ms each (30 seconds). otherwise, you'll overload ao3 with requests and get a "retry later" page; the scraper will keep scraping anyway, which will lead to incomplete data. you'll probably still overload ao3 with requests once or twice this way, but it's better than nothing. if you choose to go ahead with the default times, don't say i didn't warn you.
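One way to spot the incomplete data described above is to check which pages never produced any rows. This is a rough Python sketch, not part of the original method. It assumes your export has a column holding the URL each row was scraped from, and that the URL contains `page=N` as in the start URL pattern. The function name and the `example` username are hypothetical.

```python
import re


def missing_pages(scraped_urls, page_count):
    """Return page numbers in 1..page_count that have no scraped rows."""
    seen = set()
    for url in scraped_urls:
        # Pull the page number out of "...?page=7" style URLs.
        match = re.search(r"[?&]page=(\d+)", url)
        if match:
            seen.add(int(match.group(1)))
    return [page for page in range(1, page_count + 1) if page not in seen]


# Made-up URLs: pages 2 and 4 returned nothing, e.g. a "retry later" page.
urls = [
    "https://archiveofourown.org/users/example/readings?page=1",
    "https://archiveofourown.org/users/example/readings?page=1",
    "https://archiveofourown.org/users/example/readings?page=3",
]

print(missing_pages(urls, 4))  # [2, 4]
```

Any page it lists is worth re-scraping on its own, with the same 30 second delays.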